Symbol Table Source Maker......Peter McInerney and Bruce Love

When developing a very large program in separately assembled stages, it is nice to be able to carry forward the information in the symbol table of one section into the equates section of a later section.  You might do this as a normal part of development or in response to a bug detected in an earlier stage which forces some re-assembly.  We designed this utility program to take all the hard work out of the process of building an equate file from a symbol table.

After an assembly, BRUNning the following utility will cause whatever source is in memory to be replaced by a series of .EQ lines constructed from the current symbol table.  All global labels are included, in numerical order.  The generated source lines can be saved or merged in the usual fashion.

The plan of the program falls into three steps.  First the existing symbol table is sorted into numeric order by the value of each symbol.  Next a line corresponding to each symbol is constructed and merged into the source code.  Finally the source lines are renumbered starting with 1000 using an increment of 10, and control is passed back to the S-C Macro Assembler.

We originally wrote our program based on Version 1.1 of the S-C Macro Assembler.  Version 2.0 differs in that each symbol value uses four bytes rather than two, and the RENUMBER routine is in a different location.  Bob Sander-Cederlof added some code to handle Version 2.0, and that version is listed here.  All the changes that need to be made to use our utility with Version 1.1 are controlled by .DO-.ELSE-.FIN sets, so that you only have to change line 1030 to assemble the other version.  Since the following listing was made with the CON listing option, the code between .ELSE and .FIN is shown as non-assembled lines; this allows you to type in both versions of the program.

After an assembly, the symbol table consists of 26 chains of symbols.  A hash table of 26 pointers contains the beginning of each of the 26 chains.  There is one chain for each letter of the alphabet, and symbols are assigned to a chain based on the first letter of the symbol name.  Within each chain, the symbols are linked together in alphabetical order.  The first two bytes of each symbol entry are a forward pointer to the next symbol in the chain, or $0000 if it is the end of the chain.  If there is no chain for a particular letter, that pointer in the hash table will be $0000.

The value of the symbol is in the next two or four bytes (Version 1.1 or 2.0, respectively).  The high byte of the value is first, the low byte last.  The byte following the value contains the length of the symbol name in the lower six bits.  The length will be a number between 1 and 32, or $01 and $20.  Following the length byte are the characters of the name itself.  Some other information is stored in the table, including various flags, local labels, and any macro definitions which were in your program; however, we are not concerned with these in our program.
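To make the entry layout concrete, here is a rough Python model (not part of the assembly listing) that decodes one entry from a memory image and follows a chain to its end.  The names parse_entry and walk_chain are hypothetical.  The byte order of the forward pointer is not stated above, so low byte first is an assumption; stripping the high bit of each name character is likewise only a precaution.

```python
def parse_entry(mem, addr, value_bytes=4):
    """Decode one symbol entry at addr; returns (next_ptr, value, name).

    value_bytes is 4 for Version 2.0 and 2 for Version 1.1.
    """
    # Forward pointer: low byte first is an ASSUMPTION, not stated in the text.
    next_ptr = mem[addr] | (mem[addr + 1] << 8)
    # Value: high byte first, low byte last, as described in the text.
    value = int.from_bytes(mem[addr + 2:addr + 2 + value_bytes], "big")
    # Name length lives in the lower six bits of the byte after the value.
    length = mem[addr + 2 + value_bytes] & 0x3F
    start = addr + 3 + value_bytes
    # Masking the high bit is a precaution, not something the text specifies.
    name = bytes(b & 0x7F for b in mem[start:start + length]).decode("ascii")
    return next_ptr, value, name


def walk_chain(mem, head, value_bytes=4):
    """Follow one alphabetical chain until the $0000 end marker."""
    symbols = []
    while head != 0:
        head, value, name = parse_entry(mem, head, value_bytes)
        symbols.append((name, value))
    return symbols
```

Calling walk_chain once for each of the 26 hash table pointers would visit every global symbol; an empty chain (pointer $0000) simply returns an empty list.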

The program begins by setting the output hook to point to our routine named MYCOUT.  Any characters that are "printed" through the monitor's COUT routine will be routed to MYCOUT, at lines 2980-3070.  MYCOUT merely stores the characters in successive positions of a buffer we put at $280.  Lines 1350-1380 zap any source program still in memory, in preparation for adding the new .EQ lines.
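The output-hook idea can be modeled in a few lines of Python.  This is only a sketch: the class LineBuffer and the helper emit are hypothetical, and the three reserved bytes at the front follow the WBUF layout described later in this article.  MYCOUT strips the high bit before storing; MYCOUT1 is the entry point that stores a byte unchanged.

```python
class LineBuffer:
    """Stand-in for the buffer that MYCOUT fills, one byte at a time."""

    def __init__(self):
        # Three reserved bytes: line length plus a two-byte line number.
        self.data = bytearray(3)

    def mycout1(self, byte):
        """Store a byte exactly as given (entry point past the AND #$7F)."""
        self.data.append(byte & 0xFF)

    def mycout(self, char):
        """Strip the high bit, then store, like the normal hook entry."""
        self.mycout1(char & 0x7F)


def emit(buf, text):
    """Simulate 'printing' text through the hook with the high bit set."""
    for ch in text:
        buf.mycout(ord(ch) | 0x80)
```

The point of the two entry points is that ordinary characters lose their high bit, while a token that needs the high bit set can still be stored intact through mycout1.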

Since every symbol carries a pointer, we decided to simply re-string them on a new chain in numeric order by value.  Lines 1390-2040 build this new chain.  Lines 1390-1490 and 1990-2040 step through each of the 26 alphabetical-order (A-O) chains.  The numerical-order (N-O) chain is built with the pointer in ROOT pointing at the largest value, each symbol's pointer pointing at the next smallest value.  When we find an A-O chain which is not empty, lines 1500-1980 chomp through the chain finding the right place in the N-O chain for each symbol.
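The re-stringing step amounts to an insertion sort on a linked list.  The Python sketch below illustrates the idea with objects instead of memory addresses; Sym and restring are hypothetical names, and the handling of equal values is an assumption, since the text does not say how ties are ordered.

```python
class Sym:
    """One symbol: a name, a value, and the single link field we reuse."""

    def __init__(self, name, value, link=None):
        self.name = name
        self.value = value
        self.link = link


def restring(hash_table):
    """Move every symbol from the A-O chains onto one N-O chain.

    Returns ROOT, which points at the largest value; each link then
    points at the next smaller value.
    """
    root = None
    for node in hash_table:                  # one head per letter, or None
        while node is not None:
            following = node.link            # save it: the link gets reused
            if root is None or node.value >= root.value:
                node.link = root             # new largest goes at the front
                root = node
            else:
                prev = root
                while prev.link is not None and prev.link.value > node.value:
                    prev = prev.link
                node.link = prev.link        # splice in after prev
                prev.link = node
            node = following
    return root
```

Because the link field is overwritten as each symbol is moved, the original alphabetical chains no longer exist once the loop finishes.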

Once the symbols are all strung on the N-O chain, lines 2050-2940 use the N-O chain to generate source lines for each symbol.  Lines 2090-2100 check for the possibility of no symbols, just in case you are testing us.

Lines 2110-2210 pick up the value of the symbol (two or four bytes worth) and push it on the stack, low byte first.  The loop actually pushes the byte following the value as well, because it saved a few program bytes to include it in the loop.  Line 2220 pulls that byte back off.

Lines 2220-2280 pick up the characters of the symbol name and "print" them.  Remember that the print hook points to MYCOUT, so that the characters are really placed in WBUF starting at WBUF+3.  (The locations WBUF through WBUF+2 are reserved for the line length and line number.)

Lines 2290-2360 generate enough blanks to tab over to column 25.  If the symbol is longer than 25 characters, only one blank is generated.  All of the blanks are squeezed into a single compressed blank token ($80 + # of blanks).  We put this into WBUF by calling MYCOUT1 to avoid the AND #$7F at the beginning of MYCOUT.
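As a small illustration of the blank compression, the Python function below computes the single token byte for a given name.  The name blank_token is hypothetical, and the exact column arithmetic (25 minus the name length, with a floor of one blank) is an assumption chosen to agree with the description above.

```python
def blank_token(name, tab_column=25):
    """Return the one-byte compressed blank token that follows a name."""
    # ASSUMPTION: pad the name out to tab_column characters, and always
    # emit at least one blank when the name is too long to pad.
    blanks = max(1, tab_column - len(name))
    return 0x80 + blanks
```

A five-letter name under this assumption yields $80 + 20 = $94, and any name of 25 characters or more yields $81.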

Lines 2370-2420 "print" the string of characters " .EQ $", which are stored in backwards order in line 3090.

Lines 2430-2610 "print" the value of the symbol in hexadecimal.  Since the value may have up to three bytes of leading zeros, there is code here to suppress those bytes.
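The leading-zero suppression can be sketched in Python as follows.  The function name hex_value is hypothetical, and we assume that suppression works on whole bytes and that the last byte is always printed, so a value of zero still shows two digits.

```python
def hex_value(value_bytes):
    """Format a value (high byte first) as hex, dropping leading zero bytes."""
    first = 0
    # Skip zero bytes, but never skip the final (low) byte.
    while first < len(value_bytes) - 1 and value_bytes[first] == 0:
        first += 1
    return "".join(f"{b:02X}" for b in value_bytes[first:])
```

With a four-byte Version 2.0 value of 00 00 08 00, this gives "0800", which is what follows the "$" in the generated .EQ line under these assumptions.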

Lines 2620-2720 terminate the source line in WBUF with a $00 code, and store the line length in the first byte position.  Now the line is ready to be added to the source code being built up.

Lines 2730-2790 make room for the new source line by lowering the pointer PRG.BEG, which points at the start of the source code.  We are adding the source lines starting with the highest value, which will be at the end of the source program, and working down to the lowest value at the beginning of the source program.

Lines 2800-2850 copy the line into the hole we just made.  Note that we have not filled in a valid line number yet.
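The last three steps (finish the line, open a hole, copy the line in) can be modeled in Python like this.  The names make_line and prepend_line are hypothetical, and we assume the length byte counts the whole line, including itself, the two line number bytes, and the $00 terminator.

```python
def make_line(text):
    """Wrap tokenized line text in a length byte, number bytes, and $00."""
    line = bytearray(3)          # length + two-byte line number placeholder
    line += text
    line.append(0x00)            # end-of-line marker
    if len(line) > 255:
        raise ValueError("line too long for a one-byte length")
    line[0] = len(line)          # ASSUMPTION: length covers the whole line
    return bytes(line)


def prepend_line(mem, prg_beg, line):
    """Lower PRG.BEG by the line length and copy the line into the hole."""
    new_beg = prg_beg - len(line)
    if new_beg < 0:
        raise ValueError("no room below the current source")
    mem[new_beg:prg_beg] = line
    return new_beg
```

Since each new line goes in front of the previous ones, feeding the symbols in descending order of value leaves the finished source in ascending order.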

Lines 2860-2940 promote the ROOT pointer to the next symbol in the N-O chain.  If there are no more symbols, line 2950 calls on the RENUMBER subroutine inside the S-C Macro Assembler to put real line numbers in each line.  The point at which RENUMBER is entered is just after a series of three JSR's, all to the same address.  The instruction we branch to is a "CPX #$06".  We are pointing this out here just in case you have a version of the S-C Macro Assembler with a slightly different position for the RENUMBER subroutine.  Of course, you could omit line 2950 and just remember to type "REN" after running our program.
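For readers who want to see what the renumbering pass accomplishes, here is a Python sketch of the effect.  It is not the assembler's RENUMBER code.  The function name renumber and the prg_end parameter are hypothetical, and we assume the line number sits in the two bytes after the length byte with the low byte first, and that the length byte counts the whole line.

```python
def renumber(mem, prg_beg, prg_end, start=1000, step=10):
    """Write ascending line numbers into every source line in mem.

    Returns the number the next line would have received.
    """
    addr, number = prg_beg, start
    while addr < prg_end:
        length = mem[addr]
        if length == 0:
            raise ValueError("zero length byte; source image is damaged")
        # ASSUMPTION: line number is stored low byte first.
        mem[addr + 1] = number & 0xFF
        mem[addr + 2] = (number >> 8) & 0xFF
        addr += length               # the length byte leads to the next line
        number += step
    return number
```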

Finally, line 2960 restores the output hook to the 40-column screen output.  This will not be what you want if you are using an 80-column card.  If you are doing that, we suggest saving the output hook way back at the beginning before stuffing MYCOUT into it, and then restoring the original value here.  We didn't do it that way because we were trying every possible way to make this whole program fit in only one page.

One caveat remains.  We did not include any test to see whether the source code being generated starts to overlap the end of the symbol table.  If you have a gigantic symbol table, say over half of the available memory for source+symbols, you may run into this problem.

When you are using this program, be sure you save the source of whatever you assembled first.  Our program replaces the source in memory with the .EQ source lines.  Also, realize that the symbol table is essentially wiped out by running our program, because all the chain links are restructured for numerical order.  You will have to re-assemble the original program to re-create the original symbol table.  Of course, if you assemble the source lines we generate, you will re-create all the global labels of the original program.

We think you will find many uses for our program, beyond the ones which prompted us to write it.  We are very proud that we managed to fit everything into a single page, but don't let that stop you from adding features to fit your own needs.
1
